A General Method for Amortizing Variational Filtering
We introduce the variational filtering EM algorithm, a simple, general-purpose method for performing variational inference in dynamical latent variable models using information from only past and present variables, i.e. filtering. The algorithm is derived from the variational objective in the filtering setting and consists of an optimization procedure at each time step. By performing each inference optimization procedure with an iterative amortized inference model, we obtain a computationally efficient implementation of the algorithm, which we call amortized variational filtering. We present experiments demonstrating that this general-purpose method improves inference performance across several recent deep dynamical latent variable models.
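To make the per-step optimization concrete, here is a minimal sketch of a filtering-style free energy minimized at each time step, together with an iterative amortized update. The notation ($\lambda_t$ for the variational parameters, $f_\phi$ for the iterative inference model) is assumed for illustration and is not necessarily the paper's exact formulation.

```latex
% Hedged sketch: a per-step filtering free energy of the usual
% reconstruction-plus-KL form, with the variational parameters \lambda_t
% refined by an iterative amortized update. \lambda_t and f_\phi are
% assumed notation, not the paper's.
\mathcal{F}_t = \mathbb{E}_{q_{\lambda_t}(z_t)}\!\left[-\log p_\theta(x_t \mid z_{\le t}, x_{<t})\right]
  + D_{\mathrm{KL}}\!\left(q_{\lambda_t}(z_t)\,\|\,p_\theta(z_t \mid z_{<t}, x_{<t})\right),
\qquad
\lambda_t \leftarrow \lambda_t + f_\phi\!\left(\lambda_t,\, \nabla_{\lambda_t}\mathcal{F}_t\right)
```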
Gradient-based grand canonical optimization enabled by graph neural networks with fractional atomic existence
Mads-Peter Verner Christiansen, Bjørk Hammer
Machine learning interatomic potentials have become an indispensable tool for materials science, enabling the study of larger systems and longer timescales. State-of-the-art models are generally graph neural networks that employ message passing to iteratively update atomic embeddings that are ultimately used for predicting properties. In this work, we extend the message passing formalism with the inclusion of a continuous variable that accounts for fractional atomic existence. This allows us to calculate the gradient of the Gibbs free energy with respect to both the Cartesian coordinates of atoms and their existence. Using this, we propose a gradient-based grand canonical optimization method and document its capabilities for a Cu(110) surface oxide.
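As a rough illustration of the idea (not the authors' implementation), the sketch below substitutes a toy pair potential for a learned GNN potential and shows how automatic differentiation yields gradients of a grand-canonical objective with respect to both Cartesian coordinates and per-atom existence variables. All names here (pairwise_energy, alpha, mu) are hypothetical.

```python
# Hedged sketch: fractional atomic existence in a differentiable energy model.
# A toy Lennard-Jones-like pair term stands in for a message-passing potential.
import torch

def pairwise_energy(pos, alpha):
    """Toy pair potential whose contributions are scaled by the
    fractional existence alpha_i * alpha_j of each atom pair."""
    d = torch.cdist(pos, pos) + torch.eye(len(pos))  # avoid 0**-12 on the diagonal
    pair = 4.0 * (d**-12 - d**-6)                    # Lennard-Jones-like toy term
    w = alpha[:, None] * alpha[None, :]              # existence weighting
    mask = 1.0 - torch.eye(len(pos))                 # no self-interaction
    return 0.5 * (w * pair * mask).sum()

def gibbs_free_energy(pos, alpha, mu):
    # Grand-canonical objective: energy minus chemical potential times the
    # (fractional) particle count, G = E - mu * sum_i alpha_i.
    return pairwise_energy(pos, alpha) - mu * alpha.sum()

pos = torch.randn(8, 3, requires_grad=True)        # Cartesian coordinates
alpha = torch.full((8,), 0.5, requires_grad=True)  # fractional existence in [0, 1]
G = gibbs_free_energy(pos, alpha, mu=-0.5)
G.backward()                                       # gradients w.r.t. both pos and alpha
print(pos.grad.shape, alpha.grad.shape)
```

Both gradients come out of a single backward pass, which is the property that makes joint gradient-based optimization of geometry and composition possible.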
Reducing Reparameterization Gradient Variance
Andrew Miller, Nick Foti, Alexander D'Amour, Ryan P. Adams
Optimization with noisy gradients has become ubiquitous in statistics and machine learning. Reparameterization gradients, or gradient estimates computed via the "reparameterization trick," represent a class of noisy gradients often used in Monte Carlo variational inference (MCVI). However, when these gradient estimators are too noisy, the optimization procedure can be slow or fail to converge. One way to reduce noise is to generate more samples for the gradient estimate, but this can be computationally expensive. Instead, we view the noisy gradient as a random variable, and form an inexpensive approximation of the generating procedure for the gradient sample. This approximation has high correlation with the noisy gradient by construction, making it a useful control variate for variance reduction. We demonstrate our approach on a non-conjugate hierarchical model and a Bayesian neural net, where our method attained orders-of-magnitude (20-2,000x) reductions in gradient variance, resulting in faster and more stable optimization.
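The following is a minimal one-dimensional sketch of the control-variate idea: a cheap approximation of the gradient sample, whose mean is known in closed form, is subtracted out and its mean added back. The Taylor-based surrogate used here is an illustrative stand-in for the paper's approximation of the gradient generating procedure.

```python
# Hedged sketch of a control-variate-corrected reparameterization gradient
# for d/d_mu E_{z ~ N(mu, sigma^2)}[f(z)], with f(z) = sin(z) as a toy target.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.5, 1.0
f_prime = lambda z: np.cos(z)   # f'(z), the gradient sample (dz/dmu = 1)
f_pp = lambda z: -np.sin(z)     # f''(z), used by the surrogate

eps = rng.standard_normal(10_000)
z = mu + sigma * eps
g = f_prime(z)                  # naive reparameterization gradient samples

# Control variate: first-order Taylor model of the gradient sample around mu,
# g_tilde(eps) = f'(mu) + f''(mu) * sigma * eps, with known mean f'(mu).
g_tilde = f_prime(mu) + f_pp(mu) * sigma * eps
g_cv = g - g_tilde + f_prime(mu)  # same expectation as g, lower variance

print("naive var:", g.var(), "controlled var:", g_cv.var())
```

Because g_tilde tracks the first-order variation of the gradient sample in eps, the corrected estimator g_cv varies only at higher order, which is where the variance reduction comes from.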
Neyman-Pearson Classification under Both Null and Alternative Distributions Shift
Mohammadreza M. Kalan, Yuyang Deng, Eitan J. Neugut, Samory Kpotufe
We consider the problem of transfer learning in Neyman-Pearson classification, where the objective is to minimize the error w.r.t. a distribution $\mu_1$, subject to the constraint that the error w.r.t. a distribution $\mu_0$ remains below a prescribed threshold. While transfer learning has been extensively studied in traditional classification, transfer learning in imbalanced classification such as Neyman-Pearson classification has received much less attention. This setting poses unique challenges, as both types of errors must be simultaneously controlled. Existing works address only the case of distribution shift in $\mu_1$, whereas in many practical scenarios shifts may occur in both $\mu_0$ and $\mu_1$. We derive an adaptive procedure that not only guarantees improved Type-I and Type-II errors when the source is informative, but also automatically adapts to situations where the source is uninformative, thereby avoiding negative transfer. In addition to such statistical guarantees, the procedure is efficient, as shown via complementary computational guarantees.
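For reference, the Neyman-Pearson objective assumed above can be written as follows; the threshold $\alpha$ and the error notation $R_0$, $R_1$ are standard but chosen here for illustration.

```latex
% Hedged sketch of the Neyman-Pearson classification objective: minimize
% Type-II error under mu_1 subject to a Type-I error constraint under mu_0.
\hat{h} \in \operatorname*{arg\,min}_{h \in \mathcal{H}} \; R_1(h)
\quad \text{subject to} \quad R_0(h) \le \alpha,
\qquad
R_0(h) = \mu_0\big(h(X) = 1\big), \quad R_1(h) = \mu_1\big(h(X) = 0\big)
```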